ABSTRACT

Three billion years after the big bang (at redshift z ~ 2), half of the most massive galaxies were already old,
quiescent systems with little to no residual star formation and extremely compact with stellar mass densities at 
least an order of magnitude larger than in low-redshift ellipticals, their descendants. Little is known about how they 
formed, but their evolved, dense stellar populations suggest formation within intense, compact starbursts 1-2 Gyr 
earlier (at 3 < z < 6). Simulations show that gas-rich major mergers can give rise to such starbursts, which produce 
dense remnants. Submillimeter-selected galaxies (SMGs) are prime examples of intense, gas-rich starbursts. With a 
new, representative spectroscopic sample of compact, quiescent galaxies at z ~ 2 and a statistically well-understood
sample of SMGs, we show that z = 3-6 SMGs are consistent with being the progenitors of z = 2 quiescent 
galaxies, matching their formation redshifts and their distributions of sizes, stellar masses, and internal velocities. 
Assuming an evolutionary connection, their space densities also match if the mean duty cycle of SMG starbursts is 
$42^{+40}_{-29}$ Myr (consistent with independent estimates), which indicates that the bulk of stars in these massive galaxies
were formed in a major, early surge of star formation. These results suggest a coherent picture of the formation 
history of the most massive galaxies in the universe, from their initial burst of violent star formation through their 
appearance as high stellar-density galaxy cores and to their ultimate fate as giant ellipticals. 

Key words: cosmology: observations - galaxies: evolution - galaxies: high-redshift - galaxies: starburst - Galaxy: 
formation - submillimeter: galaxies 
1. INTRODUCTION

One of the most remarkable discoveries in galaxy evolution 
studies in the past years is that up to half of the most massive 
galaxies ($\log(M_*/M_\odot) > 11$) at z ~ 2 are old, quiescent systems
with extremely compact structures, characteristics that correspond
to stellar densities that are orders of magnitude higher
than what is seen in local elliptical galaxies (e.g., Toft et al. 
2007; van Dokkum et al. 2008; Szomoru et al. 2012). Much 
effort has gone into confirming their extreme properties and in- 
vestigating their evolutionary path to the local universe. Virial 
arguments and simulations indicate that the most important pro- 
cess is likely to be minor dry merging (e.g., Bezanson et al. 
2009; Oser et al. 2012; Cimatti et al. 2012; Toft et al. 2012), 
but observations suggest that other processes are likely also 
important, e.g., the continuous addition of increasingly larger, 
newly quenched galaxies to the quenched population with decreasing
redshift (e.g., Newman et al. 2012; Carollo et al. 2013;
Cassata et al. 2013; Krogager et al. 2013). The formation path of 
these extreme systems is largely unknown. Simulations indicate 
that highly dissipational interactions on short timescales provide 
plausible mechanisms for creating compact stellar populations, 
through either major mergers (e.g., Naab et al. 2007, 2009) or 
dynamical instabilities fed by cold gas accretion (Dekel et al. 
2009). A possible scenario is major gas-rich mergers at high 
redshift (Wuyts et al. 2010) in which the gas is driven to the 
center, igniting a massive nuclear starburst followed by an ac- 
tive galactic nucleus (AGN)/QSO phase that quenches the star 
formation and leaves behind a compact remnant (Sanders et al. 
1988; Hopkins et al. 2006; Wuyts et al. 2010). This is consistent 
with local stellar archaeology studies that imply that massive el- 
lipticals must have short formation timescales of less than 1 Gyr 
(e.g., Thomas et al. 2005). 
Several authors have pointed out that submillimeter galaxies
(SMGs) may be examples of the above scenario (e.g., Blain 
et al. 2004; Chapman et al. 2005; Tacconi et al. 2006, 2008; 
Toft et al. 2007; Cimatti et al. 2008; Capak et al. 2008; 
Schinnerer et al. 2008; Coppin et al. 2008; Michalowski et al. 
2010a; Smolcic et al. 2011; Ricciardelli et al. 2010), but see 
Riechers (2013) for a counterexample. The SMG population 
is dominated by galaxies undergoing intense, dust-enshrouded 
starbursts. A large fraction of SMGs with measured CO profiles 
show double-peaked profiles, indicative of ongoing major mergers
or rotation (Frayer et al. 1999; Neri et al. 2003; Sheth et al. 
2004; Kneib et al. 2005; Greve et al. 2005; Tacconi et al. 2006; 
Riechers et al. 2011b; Ivison et al. 2013; Fu et al. 2013). The 
autocorrelation length of SMGs is similar to that of optically 
selected QSOs, suggesting that SMGs and QSOs live in similar 
mass haloes and that the ignition of a QSO could be the event 
that quenches the star formation in SMGs (Hickox et al. 2012).
This is consistent with observations suggesting that the hosts 
of the most luminous QSOs, i.e., those likely associated with 
the formation of massive quiescent galaxies, are found to be 
primarily major mergers (Treister et al. 2012; Riechers et al. 
2008), a result that is corroborated by Olsen et al. (2013), 
who find that luminous AGNs in massive z ~ 2 galaxies must 
be triggered by external processes. Interestingly, Olsen et al. 
(2013) also find evidence for low-luminosity AGNs in the vast 
majority of massive quiescent galaxies at z ~ 2, suggesting 
that AGNs play active roles in the quenching of their star 
formation. The correlation length of SMGs is similar to that 
of z ~ 2 galaxies with $M_* > 5 \times 10^{10}\,M_\odot$ ($r_0 = 7.66 \pm 0.78$),
while z ~ 2 galaxies with $M_* > 10^{11}\,M_\odot$ cluster more strongly
($r_0 = 11.49 \pm 1.26$; Wake et al. 2011).

Recent advances in near-infrared (NIR) spectroscopy have 
made it possible for the first time to accurately constrain 
the age, dust content, and past star formation history of the 
brightest z ~ 2 quiescent galaxies through absorption line 
diagnostics and spectral fitting in the rest-frame optical (Kriek 
et al. 2009; Onodera et al. 2010, 2012; van de Sande et al. 
2011, 2012; Toft et al. 2012). These galaxies have spectra 
typical of poststarburst galaxies, with no detected emission 
lines. However, they have strong Balmer absorption lines, 
which suggests that they underwent major starbursts that were 
quenched 1-2 Gyr prior to the time of observation (i.e., 
at 3 < z < 6). Several of these galaxies show evidence of 
significant dust abundance (with $A_V$ values up to ~1 mag), and
they are baryon dominated, as is the case for local poststarburst 
galaxies (Toft et al. 2012). In combination with their extremely 
compact stellar populations, these observations suggest that the 
majority of the stars in z ~ 2 quiescent galaxies formed in 
intense, possibly dust-enshrouded nuclear starbursts, a scenario 
very similar to what is observed in z ~ 2 SMGs. 

Velocity dispersions of z ~ 2 quiescent galaxies measured 
from the width of absorption lines are in the range of 
300-500 km s$^{-1}$ (e.g., Toft et al. 2012; van de Sande et al.
2012), which is significantly higher than what is found in local
ellipticals of similar stellar mass but comparable to the
FWHM of molecular lines in 2 < z < 3 SMGs (in the range
of 350-800 km s$^{-1}$, with a mean equivalent rotational velocity
of $\langle v_c \rangle = 392 \pm 134$ km s$^{-1}$; Tacconi et al. 2006). The
line-emitting gas of SMGs, as traced by high-J CO lines,
is found to be spatially very compact, with a mean size of
$\langle R_e \rangle = 2.0 \pm 0.3$ kpc (Tacconi et al. 2006), which is comparable
to the mean spatial extent $\langle R_e \rangle = 1.96 \pm 0.8$ kpc of the
stellar populations in the quiescent z ~ 2 galaxies (Krogager
et al. 2013). We note, however, that studies of lower-J CO
lines suggest that some SMGs may have more extended
CO disks (Ivison et al. 2011; Riechers et al. 2011c). The median
dynamical mass measured from CO(1-0) for z ~ 2 SMGs,
$\langle M_\mathrm{dyn} \rangle = (2.3 \pm 1.4) \times 10^{11}\,M_\odot$ (Ivison et al. 2011), is similar
to that measured for z ~ 2 quiescent galaxies, $\langle M_\mathrm{dyn} \rangle =
(2.5 \pm 1.3) \times 10^{11}\,M_\odot$ (Toft et al. 2012).

Despite the many similarities between SMGs and z ~ 2 qui- 
escent galaxies, a major obstacle in establishing an evolutionary 
link between the two galaxy types is their similar redshift dis- 
tribution. While the quiescent nature and derived ages for z ~ 2 
quiescent galaxies suggest they formed at z > 3, the peak of 
the known SMG population was until recently found to be at 
z ~ 2, with very few examples known at z > 3 (e.g., Chapman
et al. 2005), a fact that renders an evolutionary link between the 
two populations unlikely. Recently, however, improved selec- 
tion techniques have uncovered a substantial tail stretching out 
to redshifts of z ~ 6 (Capak et al. 2008; Schinnerer et al. 2008; 
Daddi et al. 2009a, 2009b; Knudsen et al. 2010; Carilli et al. 
2010, 2011; Riechers et al. 2010, 2013; Cox et al. 2011; Combes
et al. 2012; Yun et al. 2012; Smolcic et al. 2012a; Michalowski 
et al. 2012b; Hodge et al. 2012, 2013a). 

In this article we present evidence for a direct evolutionary 
link between the two extreme galaxy populations by comparing 
the properties of two unique samples in the COSMOS field: (1) a 
spectroscopically confirmed, representative sample of compact 
z ~ 2 quiescent galaxies with high-resolution Hubble Space
Telescope (HST)/WFC3 imaging, and (2) a statistical sample
of z > 3 SMGs. In Section 2 we introduce the samples, and in
Section 3 we present our results. In particular, in Section 3.1 we
show that the distribution of formation redshifts for the z ~ 2 
galaxies is similar to the observed redshift distribution of z > 3 
SMGs, and in Section 3.2 we compare the comoving number 
densities of the two populations. In Section 3.3 we derive 
structural properties of the z > 3 SMGs, and in Section 3.4 
we show that their stellar mass-size relation is similar to that of 
z ~ 2 quiescent galaxies. In Sections 3.5 and 3.6 we show that
the duty cycle of the z > 3 SMG starbursts (which is derived
assuming they are progenitors of z ~ 2 quiescent galaxies) is
consistent with independent estimates and with the formation 
timescale derived for z ~ 2 quiescent galaxies (assuming 
they formed in Eddington-limited starbursts). In Section 4 we 
summarize and discuss the results. 

Throughout this article we assume a standard flat universe
with $\Omega_\Lambda = 0.73$, $\Omega_m = 0.27$, and $H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$. All
stellar masses are derived assuming a Chabrier (2003) initial 
mass function (IMF). 

2. SAMPLES 

2.1. Sample of z > 3 SMGs 

Based on dedicated follow-up studies with sub-mm 
interferometers (PdBI, SMA, CARMA) and optical/mm 
spectroscopy (with Keck/DEIMOS, EVLA, PdBI) toward 
1.1 mm- and 870 μm-selected sources in the COSMOS field,
Smolcic et al. (2012a) presented the redshift distribution of
SMGs. This sample shows a tail of z > 3 SMGs, corresponding
to a significantly larger number density at these high redshifts
than found in previous surveys (e.g., Chapman et al. 2005;
Wardlow et al. 2011; Yun et al. 2012; Michalowski et al. 2012b). 
A possible reason for the difference is that previous surveys did 
not have (sub-)mm follow-up interferometry and therefore may 
be subject to identification biases. For example, Hodge et al. 
Table 1

Sample of z > 3 Submillimeter Galaxies in COSMOS

Source        z                   r_e,NIR (kpc)   Note           r_e,FIR (kpc)
AzTEC 1       4.64^a              <2.6            Unresolved     1.3-2.7^c
AzTEC 3       5.299^a             <2.4            Unresolved     <3 ± 2^d
AzTEC 4       4.93 +…/-…          <2.5            Unresolved     -
AzTEC 5       3.971^a             0.5 ± 0.4       HST/WFC3       -
AzTEC 8       3.179^a             <3.0            Unresolved     -
AzTEC 10      2.79 +1.86/-1.29    0.7 ± 0.1       -              -
AzTEC 11-S    >2.58^b             -               Not detected   -
AzTEC 13      >3.59^b             -               Not detected   -
AzTEC 14-E    >3.03^b             -               Not detected   -
AzTEC 15      3.17 +0.29/-0.37    5.0 ± 0.8       Very faint     -
J1000+0234    4.542^a             3.7 ± 0.2       -              -

Vd-17871      4.622^a             1.3 ± 0.4       -              -
GISMO-AK03    4.757^a             1.6 ± 0.6       HST/WFC3       -

Notes. The top 11 galaxies constitute the S/N-limited, relatively complete
statistical sample we use for estimating the comoving number density. The
bottom two are spectroscopically confirmed z > 3 galaxies that we add to the
sample for structural analysis only. We refer to Smolcic et al. (2012a) for
details about the sample. The reported effective radii are circularized, i.e.,
$r_{e,c} = r_{e,m}\sqrt{b/a}$, where $r_{e,m}$ is the effective radius along the major axis and
b/a is the axis ratio. $r_{e,\mathrm{FIR}}$ [kpc] are rest-frame FIR sizes from the literature,
measured from high-resolution mm observations. For easy comparison to the
NIR effective radii, we here quote Gaussian HWHMs.
a Spectroscopic redshift.
b mm-to-radio flux ratio based redshift.
c Younger et al. (2008).
d Riechers et al. (2010).


(2013b) show that many of the galaxies in the Wardlow et al.
(2011) sample break up into multiple sources when studied at
high resolution, which inevitably leads to misidentifications for
some of the sources.

Here we use the Smolcic et al. (2012a) sample to estimate the 
comoving number density and other properties of z > 3 SMGs. 
Our starting point is a 1.1 mm-selected sample, drawn from the 
AzTEC/JCMT 0.15 deg$^2$ survey of the COSMOS field (Scott
et al. 2008) and observed with the Submillimeter Array (SMA) at
890 μm and ~2" angular resolution in order to unambiguously
associate multiwavelength counterparts (Younger et al. 2007,
2009). The 17 SMGs identified by the SMA follow-up form
a statistical sample, as they are drawn from a signal-to-noise-limited
(S/N$_{1.1\,\mathrm{mm}}$ > 4.5) and flux-limited ($F_{1.1\,\mathrm{mm}}$ > 4.2
mJy) 1.1 mm-selected sample over a
contiguous area of 0.15 square degrees. We include one more
SMG in this sample, J1000+0234 ($F_{1.1\,\mathrm{mm}} = 4.8 \pm 1.5$ mJy,
S/N$_{1.1\,\mathrm{mm}}$ ~ 3), which is confirmed to be at z = 4.542 (Capak
et al. 2008; Schinnerer et al. 2008, 2009). Nine out of these
18 interferometrically detected galaxies have spectroscopic
redshifts (4 are confirmed to be at z > 3; Capak 2009; Capak
et al. 2010; Schinnerer et al. 2009; Riechers et al. 2010; A.
Karim et al., in preparation), while for the remainder precise
photometric redshifts ($\sigma_{\Delta z/(1+z_\mathrm{spec})} = 0.09$) were computed by
Smolcic et al. (2012a). The z > 3 SMGs from this sample are
listed in Table 1. The top 11 objects constitute our statistical
sample. We will use these in the following sections to estimate 
the redshift distribution and comoving number density of z > 3 
SMGs. The bottom two objects are additional spectroscopically 
confirmed z > 3 SMGs in the COSMOS field, which we add to 
the sample for structural analysis only. 


The flux-limited sub-mm selection ensures a relatively homogeneous
sample of the most intensely star-forming dust-obscured
galaxies at z > 3: due to the negative k-correction,
the sub-mm flux detection limit corresponds roughly to a cut 
in star formation rate (SFR) over the considered redshift range. 
Note that while a fraction of single-dish-detected SMGs break 
up into multiple components when studied with interferometry 
at ~2" resolution, this is only the case for two of the galaxies
studied here (AzTEC 11 and 14). In the present study we as- 
sume that the close individual components are related and count 
them as one in the number density calculations (thus assuming 
they will eventually merge into one galaxy). As the galaxies 
are not resolved in the MIR-mm photometry, derived properties 
(infrared luminosities, SFRs, dust masses, etc.) pertain to the 
combined system. Neither of the two galaxies is detected in
the optical-NIR; thus, derived sizes and stellar masses for the 
sample are not affected. 

2.2. Far-infrared Emission of the z > 3 SMGs 

In order to directly constrain the SFRs, dust, and gas masses 
of the z > 3 SMGs, we made use of the (sub)-mm (AzTEC,
LABOCA, MAMBO, SMA, CARMA, and PdBI) and far-infrared
(FIR) (Spitzer MIPS, Herschel PACS, and SPIRE)
observations of the COSMOS field (Sanders et al. 2007; Lutz
et al. 2011; Oliver et al. 2012; Scott et al. 2008; Aretxaga
et al. 2011; Bertoldi et al. 2007; Younger et al. 2007, 2008;
Smolcic et al. 2012a, 2012b). The Herschel data consist of deep
PACS 100 and 160 μm observations, taken as part of the PACS
Evolutionary Probe (PEP; Lutz et al. 2011) guaranteed time
key programme, and SPIRE 250, 350, and 500 μm observations
taken as part of the Herschel Multitiered Extragalactic Survey
(HerMES;$^{19}$ Oliver et al. 2012).

PACS and SPIRE flux densities were measured using a point- 
spread function (PSF) fitting analysis (Magnelli et al. 2009; 
Roseboom et al. 2010), guided by the position of sources 
detected in the deep COSMOS 24 μm observations from the
Multiband Imaging Photometer (MIPS; Rieke et al. 2004) on
board the Spitzer Space Observatory (3σ ~ 45 μJy; Le Floc'h
et al. 2009). We cross-matched our z > 3 SMG sample with
this MIPS-PACS-SPIRE catalogue using a matching radius of
2". Results of these matches were all visually checked. For
z > 3 SMGs not included in the MIPS-PACS-SPIRE catalogue
because they lack a MIPS counterpart, we computed their PACS
and SPIRE flux densities using a PSF-fitting analysis guided 
by their positions. Further details of the FIR photometry are 
presented in V. Smolcic et al. (in preparation). 

Among the 13 z > 3 SMGs, 9 have secure mid-/far-infrared 
detections, 2 have tentative mid-/far-infrared detections, and 
2 are undetected at infrared wavelengths. From the FIR-mm 
spectral energy distribution (SED) of the z > 3 SMGs, we infer 
their infrared luminosities and dust masses using the dust model 
of Draine & Li (2007, hereafter DL07) as described in detail in
Magnelli et al. (2012). The infrared luminosity ($L_\mathrm{IR}$) is derived
by integrating the best-fitting normalized SED templates from
the DL07 library from rest-frame 8 to 1000 μm. From these we
can accurately estimate the star formation activity of the z > 3
SMGs, using the standard $L_\mathrm{IR}$-to-SFR conversion of Kennicutt
(1998), assuming a Chabrier IMF:

$$\mathrm{SFR}\ [M_\odot\ \mathrm{yr}^{-1}] = 10^{-10}\, L_\mathrm{IR}\ [L_\odot]. \qquad (1)$$
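As a quick illustration of Equation (1), the following minimal Python sketch converts a logarithmic infrared luminosity into an SFR. The function name `sfr_from_log_lir` is ours and purely illustrative; the input value is the log(L_IR) listed for AzTEC 1 in Table 2.

```python
def sfr_from_log_lir(log_lir_lsun):
    """Return the SFR in Msun/yr given log10(L_IR / Lsun).

    Implements Equation (1): SFR = 1e-10 * L_IR (Chabrier IMF).
    """
    return 1e-10 * 10.0 ** log_lir_lsun


# log(L_IR) = 13.36 is the Table 2 entry for AzTEC 1.
print(round(sfr_from_log_lir(13.36)))  # -> 2291
```

The result agrees with the SFR of 2291 listed for AzTEC 1 in Table 2. The quoted uncertainties are not reproduced here, since the sketch does not propagate the errors on L_IR.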


19 http://hermes.sussex.ac.uk
Table 2

Far-infrared SED Properties of the z > 3 SMG Sample

Source       log(M*)^a      q_PAH^b  γ^b    U_min^b  log(M_dust)^b  log(L_IR)^c    SFR^b,e        log(M_gas)^c  FIR Detection^d  log(M_gas)_CO^f
             (M_⊙)                                   [M_⊙]          [L_⊙]          (M_⊙ yr^-1)    (M_⊙)                          (M_⊙)
AzTEC 1      10.…           0.47%    0.070  25.0     9.1 ± 0.1      13.36 ± 0.09   2291 ± 528     11.7 ± 0.1    Secure           -
AzTEC 3      11.2 +…/-…     0.47%    0.290  25.0     9.3 ± 0.1      13.37 ± 0.04   2344 ± 226     11.3 ± 0.1    Secure           10.7
AzTEC 4      11.2 +…/-…     4.58%    0.080  25.0     9.6 ± 0.2      13.25 ± 0.15   1778 ± 733     11.6 ± 0.2    Tentative        -
AzTEC 5      10.9 +…/-0.5   0.47%    0.190  25.0     9.4 ± 0.1      13.43 ± 0.02   2692 ± 127     11.4 ± 0.1    Secure           -
AzTEC 8      11.…           0.47%    0.090  25.0     9.7 ± 0.1      13.45 ± 0.01   2818 ± 66      11.7 ± 0.1    Secure           -
AzTEC 10     10.5…          2.50%    0.060  5.00     9.6 ± 0.1      12.58 ± 0.10   380 ± 98       11.8 ± 0.1    Secure           -
AzTEC 11-S   -              0.47%    0.040  25.0     9.6 ± 0.1      13.30 ± 0.01   1995 ± 46      -             Secure           -
AzTEC 13     -              4.58%    0.160  2.00     9.9 ± 0.3      12.70 ± 0.20   501 ± 293      -             Upper limits     -
AzTEC 14-E   -              0.47%    0.290  0.70     9.8 ± 0.2      12.48 ± 0.18   302 ± 155      -             Upper limits     -
AzTEC 15     11.2 +…/-…     3.19%    0.010  20.0     9.3 ± 0.1      12.73 ± 0.08   537 ± 108      11.4 ± 0.1    Secure           -
J1000+0234   10.…           1.77%    0.150  25.0     9.3 ± 0.1      13.17 ± 0.09   575 ± 275      11.8 ± 0.4    Tentative        10.4
Vd-17871     10.…           4.58%    0.250  25.0     9.1 ± 0.1      13.09 ± 0.06   1230 ± 182     11.2 ± 0.2    Secure           -
GISMO-AK03   12.1 +…/-…     4.58%    0.290  3.00     9.5 ± 0.2      12.66 ± 0.19   457 ± 250      11.5 ± 0.2    Secure           -

Notes.

a Derived using MAGPHYS from the optical-FIR SED. The errors are the formal errors associated with the fit and do not include systematic errors, which can be up
to ±0.5 dex; see Section 2.3.

b The DL07 model describes the interstellar dust as a mixture of carbonaceous and amorphous silicate grains. Here we list the best-fitting values of its four free
parameters: (1) $q_\mathrm{PAH}$, which controls the fraction of dust mass in the form of polycyclic aromatic hydrocarbon (PAH) grains; (2) $\gamma$, which controls the fraction of dust
mass exposed to a power-law ($\alpha = 2$) radiation field ranging from $U_\mathrm{min}$ to $U_\mathrm{max}$, the rest of the dust mass (i.e., $1 - \gamma$) being exposed to a radiation field with a constant
intensity $U_\mathrm{min}$; (3) $U_\mathrm{min}$, which controls the minimum radiation field seen by the dust ($U_\mathrm{max}$ is fixed to a value of $10^6$); (4) $M_\mathrm{dust}$, which controls the normalization of
the SED.

c Quantities derived from the best-fitting DL07 models (see Section 2.2).

d "Secure": The source is relatively isolated and detected at S/N > 3. "Tentative": The source is detected at S/N > 3, but the flux density estimates may be affected
by bright nearby objects. "Upper limits": The source is not detected at S/N > 3.

e SFRs are notoriously model dependent, e.g., from a detailed analysis of all available data for AzTEC-3, and assuming a top-heavy IMF, Dwek et al. (2011) found a
significantly lower SFR than derived here from the DL07 fits.

f Gas masses derived from CO observations (Schinnerer et al. 2008; Riechers et al. 2010).


Finally, we estimate the gas masses of the sample through
$\log(M_\mathrm{gas}/M_\mathrm{dust}) = -0.85\,Z + 9.4$ (Leroy et al. 2011), where
$Z = 2.18 \log(M_*) - 0.0896 \log(M_*)^2 - 4.51$ (Erb et al.
2006; Genzel et al. 2012).$^{20}$ This method has been used successfully
in the local universe (e.g., Leroy et al. 2011; Bolatto
et al. 2011), as well as at high redshift (Magdis et al. 2011,
2012; Magnelli et al. 2012). Assumptions and limitations of
this method in the case of high-redshift galaxies are extensively
discussed in Magnelli et al. (2012). Results of the FIR SED
fits and derived quantities are summarized in Table 2 and used
in the following analysis to establish an evolutionary link between
z > 3 SMGs and quiescent z ~ 2 galaxies. The derived
gas masses are comparable to or larger than the derived stellar
masses: $\langle f_g \rangle = \langle M_\mathrm{gas}/(M_* + M_\mathrm{gas}) \rangle = 0.71 \pm 0.03$, in agreement
with the high gas fractions found in previous studies of
high-redshift SMGs (e.g., Carilli et al. 2010; Riechers et al.
2011c). We caution, however, that gas masses estimated from
FIR SED fits are relatively uncertain (potentially up to a factor
of 5-10). For example, in Table 2 we list gas masses for two
objects in our sample that have independent estimates derived
from CO line emission. These are significantly different from
our SED estimates. The main factors contributing to the uncertainties
in the SED estimates are that the (sub)-mm measurements
do not trace cold gas very well (there the (sub)-mm/CO flux ratio
is much lower than in the starburst nucleus, yet a lot of the gas
mass may reside there), the metallicity correction (which has a
large scatter), and the assumption about the gas-to-dust ratio.
The main factors contributing
to the uncertainty of the CO measurements are the assumed
$\alpha_\mathrm{CO}$, which can be uncertain by a factor >2, and the excitation
corrections, which can be uncertain by a factor of ~2-4.

20 Thus making the assumption that the mass-metallicity relation at z ~ 2
applies to galaxies at z > 3.
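The gas-mass recipe of Section 2.2 can be written out as a short Python sketch. The function names are ours, and the input values (log M* = 11.2, log M_dust = 9.3) are illustrative numbers of the order of those in Table 2, not a reproduction of any particular table entry. We assume that Z in the text is the gas-phase metallicity returned by the mass-metallicity relation.

```python
def metallicity_from_stellar_mass(log_mstar):
    """Metallicity Z from the mass-metallicity relation quoted in the text."""
    return 2.18 * log_mstar - 0.0896 * log_mstar ** 2 - 4.51


def log_gas_mass(log_mdust, log_mstar):
    """log10(M_gas / Msun) from the dust mass and a metallicity-dependent
    gas-to-dust ratio: log(M_gas / M_dust) = -0.85 * Z + 9.4."""
    z = metallicity_from_stellar_mass(log_mstar)
    log_gas_to_dust = -0.85 * z + 9.4
    return log_mdust + log_gas_to_dust


def gas_fraction(log_mgas, log_mstar):
    """f_g = M_gas / (M_* + M_gas), with both masses given as log10 values."""
    mgas = 10.0 ** log_mgas
    mstar = 10.0 ** log_mstar
    return mgas / (mstar + mgas)


# Illustrative inputs only (assumed values, not taken from a specific row).
log_mgas = log_gas_mass(log_mdust=9.3, log_mstar=11.2)
print(f"log(M_gas) = {log_mgas:.2f}, f_g = {gas_fraction(log_mgas, 11.2):.2f}")
```

Because the mass-metallicity relation is nearly flat at these stellar masses, the inferred gas-to-dust ratio changes only slowly with M*, so the dust mass dominates the result.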

2.3. Stellar Mass Estimates for the z > 3 SMGs 

We estimate stellar masses of the z > 3 SMGs from their 
UV-MIR (8 μm) broadband photometry as described in V.
Smolcic et al. (in preparation). Briefly, stellar masses were derived
by fitting the observed broadband UV-MIR SEDs with
the MAGPHYS code (da Cunha et al. 2008). The stellar component
in the MAGPHYS models is based on Bruzual & Charlot
(2003) stellar population synthesis models assuming various star
formation histories (exponentially declining SFHs with random
timescales plus superimposed stochastic bursts) and a Chabrier
(2003) IMF. The stellar masses for the SMGs and their formal
uncertainties drawn from the probability distribution function
(generated from the $\chi^2$ fit values by MAGPHYS) are given in
Table 2. We note, however, that stellar masses for SMGs are 
strongly dependent on the assumed star formation histories and 
may lead to systematic discrepancies of ±0.5 dex given differ- 
ent assumptions and stellar population synthesis models (see 
Table 1 in Michalowski et al. 2012a) and whether or not emis- 
sion lines are included in the templates (Schaerer et al. 2013). 
For example, using the double SFHs implemented in GRASIL 
(Silva et al. 1998; Iglesias-Paramo et al. 2007), we find 
systematically higher stellar masses, consistent with the results
from Michalowski et al. (2010b). On the other hand, dynami- 
cal mass considerations based on CO line observations for two 
objects in our sample (Schinnerer et al. 2008; Riechers et al. 
2010) suggest lower stellar masses than inferred by MAGPHYS. 
Therefore, here we adopt the middle values, i.e., the stellar 
masses computed by MAGPHYS+BC03, noting that these may 
be subject to systematic uncertainties. 

2.4. Sample of z ~ 2 Compact Quiescent Galaxies 

It is well established from deep multiwaveband photometric 
surveys that a substantial population of quiescent massive 
galaxies with extremely compact structure exists at z ~ 2 (Daddi 
et al. 2005; Toft et al. 2005, 2007, 2009; Trujillo et al. 2006; 
Franx et al. 2008; Williams et al. 2010; van Dokkum et al. 2010; 
Brammer et al. 2011; Szomoru et al. 2011, 2012; Damjanov 
et al. 2011; Newman et al. 2012; Cassata et al. 2013). Samples 
of spectroscopically confirmed, z ~ 2 quiescent galaxies with 
accurate stellar population model fits and high angular resolution 
space-based NIR imaging are much more sparse (van Dokkum
et al. 2008). As a high-quality comparison set to the z > 3 SMGs 
we use the sample of Krogager et al. (2013, K13 hereafter). 
This sample consists of 16 spectroscopically confirmed massive 
quiescent galaxies, selected from the 3D-HST survey in the
COSMOS field by requiring strong 4000 Å breaks in the
grism observations. As shown in K13, this effectively selects 
a representative sample of massive (log M > 10.9) quiescent 
galaxies around z ~ 2. The high-S/N grism spectra around the
break, in combination with multiwaveband photometry from the
COSMOS survey, allow for strong constraints on the stellar
populations, including stellar masses, dust contents, mean stellar
ages (i.e., the time elapsed since the last major episode of star
formation), and formation redshifts (derived from the
stellar ages). The sample is also covered by high-resolution
NIR imaging with HST/WFC3 from the CANDELS survey
(Grogin et al. 2011; Koekemoer et al. 2011), yielding accurate
constraints on the rest-frame optical surface brightness profiles
and effective radii ($r_e$).
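To make the step from a stellar age to a formation redshift concrete, the following Python sketch uses the closed-form age-redshift relation of a flat matter-plus-Λ universe with the cosmological parameters adopted in this article. This is our own illustration, not necessarily the procedure used by K13: radiation is neglected, the function names are hypothetical, and the example age of 1.5 Gyr is an assumed value.

```python
import math

# Cosmology adopted in the text (flat LCDM).
H0 = 71.0                      # km/s/Mpc
OMEGA_M = 0.27
OMEGA_L = 0.73
HUBBLE_TIME_GYR = 977.8 / H0   # 1/H0 in Gyr


def age_at_redshift(z):
    """Age of the universe (Gyr) at redshift z, flat matter + Lambda model."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * HUBBLE_TIME_GYR * math.asinh(x)


def redshift_at_age(t_gyr):
    """Inverse of age_at_redshift: redshift at cosmic age t_gyr (Gyr)."""
    s = math.sinh(1.5 * math.sqrt(OMEGA_L) * t_gyr / HUBBLE_TIME_GYR)
    return (math.sqrt(OMEGA_L / OMEGA_M) / s) ** (2.0 / 3.0) - 1.0


def formation_redshift(z_obs, stellar_age_gyr):
    """Redshift at which stars of the given age formed, for a galaxy
    observed at z_obs."""
    t_form = age_at_redshift(z_obs) - stellar_age_gyr
    if t_form <= 0.0:
        raise ValueError("stellar age exceeds the age of the universe at z_obs")
    return redshift_at_age(t_form)


# Assumed example: a galaxy at z = 2 with a mean stellar age of 1.5 Gyr.
print(f"z_form = {formation_redshift(2.0, 1.5):.2f}")
```

For these assumed inputs the formation redshift comes out at roughly 3.5, within the 3 < z < 6 range discussed in the Introduction.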

3. RESULTS 

3.1. Redshift Distributions 

From the spectroscopic redshifts and mean stellar ages 
available for the quiescent z ~ 2 galaxy sample described in 
Section 2.4 we can estimate the distribution of their formation 
redshifts. In Figure 1 this distribution is compared to the 
observed redshift distribution of the sample of z > 3 SMGs 
described in Section 2.1. Due to the small number of galaxies 
in both samples, a one-to-one correspondence is not expected. 
However, we stress that the two distributions are similar, with 
a peak at z ~ 3 and a tail toward higher redshifts. A two- 
sample Kolmogorov-Smirnov (K-S) test yields a statistic of 
0.29 with a p-value of 54%, and it is thus not inconsistent with
the two redshift distributions being drawn from the same parent 
distribution. 
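For readers who wish to repeat this kind of comparison, the sketch below implements a two-sample K-S statistic and an asymptotic p-value in plain Python. The p-value uses the standard series approximation to the Kolmogorov distribution with a small-sample correction, which is an assumption on our part and need not be the exact method behind the numbers above. The two redshift lists are made-up placeholders, not the data of this article.

```python
import bisect
import math


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue_asymptotic(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the Kolmogorov series."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; p is ~1 anyway
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Made-up placeholder redshifts, for illustration only.
z_formation = [2.8, 3.0, 3.1, 3.3, 3.6, 4.0, 4.8]
z_smg = [3.0, 3.2, 4.0, 4.5, 4.6, 5.3]
d = ks_statistic(z_formation, z_smg)
print(d, ks_pvalue_asymptotic(d, len(z_formation), len(z_smg)))
```

With a statistic of 0.29 and assumed sample sizes of 16 and 11, this approximation gives a p-value of about 0.56, close to the value quoted above.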

3.2. Comoving Number Densities 

The next step in establishing an evolutionary connection be- 
tween z > 3 SMGs and quiescent galaxies at z ~ 2 is to compare 
their comoving number densities. The comoving number density 
of massive quiescent galaxies as a function of redshift is well 
constrained from photometric redshift and stellar population 



Figure 1. Comparison of the redshift distribution of z > 3 SMGs and the 
formation redshifts of z ~ 2 quiescent galaxies. The red histogram shows the 
distribution of formation redshifts estimated for a spectroscopically confirmed 
sample of compact quiescent galaxies at z ~ 2 from their observed redshift 
and derived luminosity-weighted ages of their stellar populations (K13). The 
blue histogram shows the distribution of redshifts of the statistical sample 
of z > 3 SMGs. Note that the galaxies that only have lower limits on the 
redshifts have been placed in the bins corresponding to those limits, and that 
the histogram includes two galaxies where the best-fitting redshift is slightly 
below 3, but for which a z > 3 photometric redshift solution falls within the 99% 
confidence interval. The smooth curves show probability density distributions 
(kernel density estimates (KDEs)) of the two populations. The two redshift 
distributions are similar, consistent with the hypothesis that z > 3 SMGs are the 
direct progenitors of z ~ 2 compact quiescent galaxies (cQGs). 

(A color version of this figure is available in the online journal.) 

model fits to deep multiwaveband photometry (e.g., Williams 
et al. 2010; Brammer et al. 2011). Here we estimate a comoving
number density of $(6.0 \pm 2.1) \times 10^{-5}$ Mpc$^{-3}$ for quiescent
galaxies at z ~ 2 with $\log(M/M_\odot) > 11$ as the mean of the
densities measured at z = 1.9 and z = 2.1 by Brammer et al.
(2011). The error includes a contribution of cosmic variance of
12% (Moster et al. 2011).

To derive the surface number density of z > 3 SMGs, we 
take all SMGs from the 1.1 mm-selected COSMOS sample that
could lie at z > 3 given their lower or upper 99% confidence
levels of the photometric redshift (reported in Table 4 in
Smolcic et al. 2012a). We then derive an average value of
the surface density by taking the most probable photometric
redshift (or spectroscopic redshift where available),$^{21}$ and the
lower$^{22}$ and upper$^{23}$ surface density values by taking the limiting
redshifts corresponding to the 99% confidence intervals of the
photometric redshifts. This yields a surface density of z > 3,
bright ($F_{1.1\,\mathrm{mm}}$ > 4.2 mJy) SMGs of 60 ± 10 deg$^{-2}$. Note that
conservatively excluding from the analysis all three SMGs in the
sample that are not significantly detected at other wavelengths
(AzTEC 11S, AzTEC 13, AzTEC 14E), and thus only have lower
redshift limits, yields a surface density of 40 deg$^{-2}$.

The derived surface density values for z > 3 SMGs may be 
subject to systematic effects. The completeness of the AzTEC/JCMT
COSMOS survey, shown in Figure 8 in Scott et al. (2008),
is roughly 50%, 70%, and 90% at $F_{1.1\,\mathrm{mm}}$ = 4.2, 5, and 6 mJy,
respectively. Taking this into account in combination with the 
deboosted 1.1 mm fluxes of the SMGs (see Younger et al. 2007, 


21 Taking the most probable photometric redshift reveals that nine SMGs
(AzTEC1, AzTEC3, AzTEC4, AzTEC5, AzTEC8, AzTEC13, AzTEC14E,
AzTEC15, and J1000+0234) are at z > 3.

22 In this case eight SMGs (AzTEC1, AzTEC3, AzTEC4, AzTEC5, AzTEC8,
AzTEC13, AzTEC14E, and J1000+0234) are at z > 3.

23 In this case 10 SMGs (AzTEC1, AzTEC3, AzTEC4, AzTEC5, AzTEC8,
AzTEC10, AzTEC11S, AzTEC13, AzTEC14E, and J1000+0234) are at z > 3.
Figure 2. Gallery of z > 3 SMGs. Top: NIR images, 8" on a side. For AzTEC5 and GISMO-AK03 we show HST/WFC3 F160W images from the CANDELS survey.
For the rest we show stacked Y-, J-, H-, and Ks-band images from the UltraVISTA survey. Middle: Sérsic n = 1 GALFIT models of the 2D surface brightness distributions
of the SMGs and their nearby companions. Bottom: Residual images, i.e., the original images shown in the top panel with the best-fitting models in the
middle panel subtracted.


2009) reveals that the derived surface densities could be roughly 
a factor of 1.5 higher than that reported above. On the other 
hand, the AzTEC/JCMT COSMOS field may be overdense 
(Austermann et al. 2009), which would imply that the true z > 3 
SMG surface density averaged over a larger area would be lower. 

Our best estimate of the comoving number density of
3 < z < 6 SMGs is $(2.1 \pm 0.4) \times 10^{-6}$ Mpc$^{-3}$, which is significantly
lower than the space density of z ~ 2 quiescent galaxies.
This is expected as z > 3 galaxies only enter the mm-selection 
criterion during their intense starburst phase. In Section 3.5 we 
use the observed difference in comoving number densities to 
constrain the duty cycle of the SMG starbursts. 
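
The conversion from a source count to a comoving number density can be sketched as follows. The cosmology (flat $\Lambda$CDM with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.3$), the survey area (0.15 deg$^2$), and the use of nine sources are assumptions made for illustration, as none of them is specified in this section. The result should therefore only be expected to agree with the quoted density to within its uncertainty.

```python
import math

# ASSUMPTIONS (not given in this section): flat LambdaCDM cosmology and survey area.
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
C_KM_S = 299792.458
AREA_DEG2 = 0.15


def comoving_distance_mpc(z, steps=10000):
    """Line-of-sight comoving distance (Mpc), integrated with the trapezoidal rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + (1.0 - OMEGA_M))

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return C_KM_S / H0_KM_S_MPC * total * dz


def comoving_volume_mpc3(z_lo, z_hi, area_deg2):
    """Comoving volume (Mpc^3) between z_lo and z_hi over a patch of sky (flat universe)."""
    solid_angle_sr = area_deg2 * (math.pi / 180.0) ** 2
    d_lo = comoving_distance_mpc(z_lo)
    d_hi = comoving_distance_mpc(z_hi)
    return solid_angle_sr / 3.0 * (d_hi ** 3 - d_lo ** 3)


volume = comoving_volume_mpc3(3.0, 6.0, AREA_DEG2)
n_smg = 9 / volume  # nine sources, as for the most probable redshifts

print(f"V(3 < z < 6) = {volume:.2e} Mpc^3, n_SMG = {n_smg:.1e} Mpc^-3")
```

With these inputs the density comes out at the level of a few times $10^{-6}$ Mpc$^{-3}$, the same order as the best estimate quoted above.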

3.3. Rest-frame UV-Optical Structure of z > 3 SMGs: 
Disks, Spheroids, or Mergers? 

The high redshifts and large amounts of dust in the z > 3 
SMGs render them extremely faint in the rest-frame UV and 
optical, despite their high stellar masses and SFRs. This makes 
it challenging to constrain their structure. To achieve the least 
biased estimates of the distribution of stellar mass, one would 
need to study the surface brightness distributions in the rest- 
frame optical/NIR, or as close to these wavebands as possible. 
Ideally, the observations would be done in the observed mid- 
infrared, but at the low spatial resolution of current facilities 
(e.g., Spitzer) the galaxies remain unresolved. Until the James Webb
Space Telescope becomes operational, the best that can be
achieved is to study the galaxies in the observed NIR. For most
of the galaxies in the sample this wavelength range probes rest-frame
wavelengths around the 4000 Å break and thus should be
a relatively good tracer of the stellar mass distribution. For two
galaxies (AzTEC5 and GISMO-AK03), we use space-based
NIR imaging with HST/WFC3, which is available from the
CANDELS survey. This is preferable to ground-based imaging
given the higher resolution (FWHM ~ 0."2). For the remaining
galaxies we use deep NIR imaging provided by the UltraVISTA
survey (5σ AB depths range from 23.7 in the K band to 24.6
in the Y band; McCracken et al. 2012). The resolution of
these observations is lower (FWHM ~ 0."8), but it has been
demonstrated that relatively unbiased sizes (down to a fraction
of the $\mathrm{FWHM_{PSF}}$) can be derived from such data when the S/N
is high and the PSF is well known (e.g., Trujillo et al. 2006; Toft
et al. 2009; Williams et al. 2010). To increase the S/N, we stack
the Y-, J-, H-, and K-band images.

Postage stamp images of the galaxies are shown in the top 
panel of Figure 2. NIR counterparts of 10 of the 13 sources 
are detected, 8 of which have relatively high S/N (the faintest 
ones have S/N ~ 10). We fit 2D Sersic models to their surface 
brightness distributions with galfit (Peng et al. 2002), using
similarly stacked images of nearby stars as PSF models. We find
the Sersic n to be relatively poorly constrained from the data.
Leaving it free in the fits in all cases results in low values n < 2,
with a median value of $\langle n \rangle = 0.6 \pm 0.1$, but with relatively large
errors. To limit the degrees of freedom in the fits, we therefore
fix it to n = 1. The reduced $\chi^2$ values of these fits are in all cases
similar to those with n free, and they are better than those of fits with n
fixed to 4.

The best-fitting effective radii, encompassing half the light of
the model, are reported in Table 1. Half of the detected galaxies
(five) have close companions. In these cases we model both
components simultaneously and report the parameters for the
main component (closest to the center of the mm emission).
Also listed in Table 1 are rest-frame FIR sizes for two galaxies
in our sample derived from interferometric sub-mm imaging
observations. These agree with the sizes derived from the NIR
data. The rest-frame FIR sizes directly measure the extent of
the star-forming regions, which we hypothesize evolve into the
compact stellar populations at z = 2; thus, the agreement is
encouraging.

Our analysis shows that apart from being very compact, 
the z > 3 SMGs are not isolated, smooth, single-component 
galaxies. All the detected galaxies show evidence of close 
companions or clumpy substructure (see Figure 2). From these 
observations alone, it is not possible to deduce whether this is 
due to chance projections, ongoing minor/major mergers, or 
perhaps multiple star-forming regions in individual galaxies, 
as resolved photometry and spectroscopy are not available. 
We note, however, that the two galaxies with HST/WFC3
data appear to have well-separated individual components of 
comparable brightness, favoring the merger interpretation. This 
is consistent with direct observational evidence for SMGs 
being major mergers, i.e., having multiple close components 
at the same redshift (e.g., Fu et al. 2013; Ivison et al. 2013). 
Simulations suggest that the timescale for major mergers is 
typically 0.39 ± 0.30 Gyr (Lotz et al. 2010). The cosmic time 
available between the observed epoch of the SMGs at z = 3-6
and their proposed remnants at z = 2 is 1-2 Gyr. If (some of) the 
SMGs are major mergers, there is thus sufficient time available 
for them to coalesce to a single quiescent remnant at z = 2. 

In the local universe most star-forming galaxies are well fit by
exponential disk profiles corresponding to n = 1 (Wuyts et al.
2011), while irregular galaxies and (precoalescence) mergers
are often best fit by models with lower n-values (n < 1). At
the S/N and resolution of the galaxies in the UltraVISTA data the
confidence in derived Sersic parameters is limited. However, the
persistently low values found for the whole z > 3 SMG sample,
including the two galaxies with the higher resolution HST/WFC3
data, suggest that the galaxies are more consistent with
disks or mergers than spheroids. A similar conclusion was reached
for a sample of 22 SMGs at 1 < z < 3 with HST/WFC3 data
(Targett et al. 2013) for which the majority were best fit by low-n
Sersic models, with a mean $\langle n \rangle = 1.2 \pm 0.1$. If z > 3 SMGs
are progenitors of z ~ 2 quiescent galaxies, then their evolution
must include a transformation of their surface brightness profiles
that increases their Sersic indices, as surface brightness profiles
of quiescent galaxies at z ~ 2 are more centrally concentrated
(Wuyts et al. 2011; Szomoru et al. 2011; Bell et al. 2012), e.g.,
the sample of K13 has $\langle n \rangle = 4.0 \pm 0.1$. We discuss a possible
mechanism for this transformation in Section 4.

Figure 3. Comparison of the stellar mass-size plane of z > 3 SMGs, z ~ 2
quiescent galaxies, and local galaxies. The red points represent z ~ 2 quiescent
galaxies. Black points represent z > 3 SMGs. For the latter, the solid error
bars represent the errors associated with the MAGPHYS SED fits. The dotted
error bars are possible systematic errors that extend to values that we derive
using the Michalowski et al. (2010a) templates. The gray cloud shows the
mass-size distribution of massive local galaxies in the Sloan Digital Sky Survey.
The mass-size distribution of SMGs is similar to that of z ~ 2 quiescent
galaxies, significantly offset from the local relation and consistent with a direct
evolutionary connection between the two populations.

(A color version of this figure is available in the online journal.)

3.4. Mass-Size Relation 

Combining the derived stellar masses and effective radii of 
the 3 < z < 6 SMGs, in Figure 3 we compare their stellar 
mass-size distribution to that of z = 2 quiescent galaxies and 
of massive early-type galaxies in the local universe. Two of the 
10 NIR-detected SMGs are relatively extended with effective 
radii comparable to those in local galaxies of similar mass. Both 
of these (AzTEC10 and AzTEC15) appear from their NIR
images to be ongoing mergers. The remaining 8 galaxies are
extremely compact, with $r_e < 2.5$ kpc. Four are unresolved in
the UltraVISTA data. For these we adopt upper limits on their
effective radii corresponding to $0.5 \times \mathrm{FWHM_{PSF}}$.

The stellar mass-size distribution of the 3 < z < 6 SMGs 
is similar to that of z ~ 2 quiescent galaxies. Both populations 
are smaller than local galaxies of similar mass by an average 
factor of ~3. From the derived quantities we can infer the mean
internal stellar mass surface densities within the effective radius
($\Sigma = 0.5\,M_*/\pi r_e^2$) of the z > 3 SMGs and z = 2 cQGs, which
we find to be similar: $\langle\log\Sigma\rangle_{\rm SMGs} \sim 9.9 \pm 0.1\,M_\odot$ kpc$^{-2}$ and
$\langle\log\Sigma\rangle_{\rm cQGs} \sim 9.8 \pm 0.1\,M_\odot$ kpc$^{-2}$, in both cases more than
an order of magnitude higher than in local early-type galaxies
of similar mass. This is consistent with a picture where the
SMGs passively evolve into compact quiescent galaxies after
their starbursts are quenched.
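
The surface density defined above, half the stellar mass divided by the area within the effective radius, is straightforward to evaluate. The following minimal sketch uses a hypothetical galaxy (stellar mass and radius chosen for illustration, not taken from Table 1) and shows how strongly the result depends on the effective radius.

```python
import math


def log_surface_density(log_mstar, r_e_kpc):
    """log10 of Sigma = 0.5 * M_* / (pi * r_e^2), in M_sun kpc^-2."""
    sigma = 0.5 * 10.0 ** log_mstar / (math.pi * r_e_kpc ** 2)
    return math.log10(sigma)


# Hypothetical compact galaxy: log(M*/M_sun) = 11.0, r_e = 1.5 kpc
compact = log_surface_density(11.0, 1.5)

# Same mass with a three times larger effective radius
extended = log_surface_density(11.0, 4.5)

print(f"compact:  log Sigma = {compact:.2f}")
print(f"extended: log Sigma = {extended:.2f}")
print(f"difference = {compact - extended:.2f} dex")
```

Because $\Sigma \propto r_e^{-2}$ at fixed mass, a factor of three in radius alone corresponds to a factor of nine (about 0.95 dex) in surface density.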


3.5. Duty Cycle of SMG Starbursts

The observed space density of z > 3 SMGs is a factor of 
~30 lower than the space density of z ~ 2 quiescent galaxies 
(see Section 3.2). However, the SMGs only enter the sub-mm-selected
($F_{1.1\,\mathrm{mm}} > 4.2$ mJy) sample during their intense
starburst phase where they have very high SFRs. The duration of
this phase, i.e., the duty cycle $t_{\rm burst}$, which ends when the supply
of gas is depleted or the star formation is quenched, e.g., by 
feedback from supernovae or AGNs, has been estimated to be 
in the range of 40-200 Myr based on gas depletion timescales 
(Greve et al. 2005; Tacconi et al. 2006; Riechers et al. 2011a) 
and clustering analysis (Hickox et al. 2012). If we assume that 
all of the z > 3 SMGs evolve into z ~ 2 quiescent galaxies, 
and that they only undergo one SMG phase, we can estimate 
the average duty cycle of their starbursts from the observed 
comoving number densities of the two populations as 

$$t_{\rm burst} = t_{\rm obs} \times \left(n_{{\rm SMG},\,z>3} / n_{{\rm cQG},\,z=2}\right), \qquad (2)$$

where $t_{\rm obs}$ is the cosmic time spanned by the redshift
interval 3 < z < 6 from which the z > 3 SMGs are selected.
Using the comoving number densities, we can thus constrain
the duty cycle of the SMGs to $t_{\rm burst} = 42^{+33}_{-15}$ Myr. This number,
however, does not include possible systematic uncertainties on 
the number density of SMGs discussed in Section 3.2. If we 
conservatively assume the two extreme cases where (1) the 
SMG sample is 100% complete, and the field is three times 
overdense; and (2) the sample is a factor of 1.5 incomplete 
but not overdense, the derived timescales are in the range
of 14 Myr < $t_{\rm SMG}$ < 62 Myr. The systematic uncertainty on
the timescale is thus of the order of 24 Myr. Therefore,
our constraint on the average duty cycle in z > 3 SMGs
is $t_{\rm burst} = 42^{+40}_{-29}$ Myr, where the errors have been added in
quadrature. This value is consistent with the independently 
estimated duty cycles based on gas depletion timescales, thus 
affirming the idea that z > 3 SMGs are progenitors of z ~ 2 
quiescent galaxies. The derived timescale does not depend 
strongly on the z = 6 upper limit adopted for the SMG redshift 
distribution. Adopting limits of z = 5.5 or z = 7 instead leads 
to timescales of 44 and 37 Myr, respectively. We note that the 
validity of the timescale calculation presented here relies on the 
assumption of a direct evolutionary connection between the two 
populations, implying that all z = 2 quiescent galaxies were 
once z > 3 SMGs and all z > 3 SMGs evolve into z = 2 
quiescent galaxies. 
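
The duty-cycle estimate amounts to scaling the cosmic time available in the selection window by the ratio of number densities. The sketch below evaluates it using the analytic age of a flat $\Lambda$CDM universe. The cosmological parameters ($H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_m = 0.3$) are assumptions, and the density ratio is simply the "factor of ~30" quoted at the start of this section, so the result is indicative only.

```python
import math

# ASSUMPTIONS (not given in this section): flat LambdaCDM parameters.
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
KM_PER_MPC = 3.0856775814913673e19
SEC_PER_GYR = 3.15576e16


def age_gyr(z):
    """Age of a flat LambdaCDM universe at redshift z, in Gyr (analytic solution)."""
    omega_l = 1.0 - OMEGA_M
    hubble_time_gyr = KM_PER_MPC / H0_KM_S_MPC / SEC_PER_GYR
    x = math.sqrt(omega_l / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time_gyr * math.asinh(x)


def burst_duty_cycle_myr(z_lo, z_hi, density_ratio):
    """t_burst = t_obs * (n_SMG / n_quiescent), returned in Myr."""
    t_obs_gyr = age_gyr(z_lo) - age_gyr(z_hi)
    return 1000.0 * t_obs_gyr * density_ratio


t_obs = age_gyr(3.0) - age_gyr(6.0)
t_burst = burst_duty_cycle_myr(3.0, 6.0, 1.0 / 30.0)  # SMGs ~30 times rarer

print(f"t_obs = {t_obs:.2f} Gyr, t_burst = {t_burst:.0f} Myr")
```

With these inputs the time window is a little over 1 Gyr and the implied duty cycle is of order 40 Myr, in line with the value derived in the text.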

3.6. Star Formation Rate and Timescale of z = 2 
Quiescent Galaxies during Their Formation 

We can infer a lower limit on the SFR of the z = 2 quiescent 
galaxies during their formation by assuming that they started 
forming stars at z = 10 and did so at a constant rate until their 
inferred formation redshifts. The minimum average SFR needed
to acquire their observed stellar masses at z = 2 calculated
in this way is $\langle\mathrm{SFR_{min}}\rangle = 115 \pm 5\,M_\odot$ yr$^{-1}$. This is a factor
of >3 larger than the observed average SFR in star-forming
Lyman break galaxies (LBGs) at z > 3 (Carilli et al. 2008).
Furthermore, the space density of z ~ 2 quiescent galaxies with
$\log(M_*/M_\odot) > 11$ is 5, 10, and >100 times larger than that of
similar mass LBGs at z = 4, 5, and 6, respectively (Stark et al.
2009). Their progenitors must therefore have had much larger 
SFRs and are missing from LBG samples. This suggests that 
they must be dust-obscured starburst galaxies.

Figure 4. Comparison of the SFRs and starburst timescales derived for the z > 3 SMGs and z ~ 2 quiescent galaxies. Left: The red curve shows the probability
density distribution (KDE) of SFRs of the z ~ 2 quiescent galaxies during their formation, calculated assuming they formed in Eddington-limited starbursts. The blue
curves show probability distributions for 1000 realizations of ongoing SFRs in the z > 3 SMGs, estimated from their total infrared luminosity and associated errors.
The two distributions span the same range, in support of an evolutionary connection between quiescent galaxies and SMGs. Middle: Probability density distribution of
the duration of the cQG starbursts, calculated assuming that all their observed stellar mass formed in Eddington-limited bursts. The gray area indicates the constraints
on the duty cycle of the SMG starbursts derived from their number density. Right: Same as the middle plot, but assuming that only half of the z ~ 2 quiescent galaxies’
stellar mass formed in the Eddington-limited burst. The two independent measures of $t_{\rm burst}$ are consistent, in agreement with z > 3 SMGs being progenitors of z ~ 2
quiescent galaxies.

(A color version of this figure is available in the online journal.) 


Based on the observed line widths and compact spatial extent
of molecular line-emitting regions, SMGs are often argued to
be maximum starbursts, i.e., they form stars at a rate close to the
Eddington limit. Assuming a spherically symmetric geometry, an
isothermal sphere density structure, a small volume filling factor
for molecular gas, and a Chabrier IMF based on Thompson et al.
(2005), Younger et al. (2010) approximate this “maximum SFR”
as

$$\mathrm{SFR}_{\max} = 480\,\sigma_{400}^{2}\,D_{\rm kpc}\,\kappa_{100}^{-1}\ [M_\odot\,\mathrm{yr}^{-1}], \qquad (3)$$

where $\sigma_{400}$ is the line-of-sight gas velocity dispersion in units
of 400 km s$^{-1}$, $\kappa_{100}$ is the opacity in units of cm$^2$ g$^{-1}$ (usually
taken to be $\approx 1$; Murray et al. 2005; Thompson et al. 2005),
and $D_{\rm kpc}$ is the characteristic physical scale of the starburst
(usually approximated as the Gaussian FWHM of the line-emitting
region). In Figure 4, the blue curves show probability
distributions for 1000 realizations of ongoing SFRs in the
z > 3 SMGs, estimated from their total infrared luminosity
and associated errors, through Equation (1). The SMGs are
forming stars at high rates, 500-3000 $M_\odot$ yr$^{-1}$, close to the
Eddington limit. For example, Younger et al. (2010) estimated
the maximum SFR of AzTEC4 and AzTEC8 to be in the range
of 1900-3800 $M_\odot$ yr$^{-1}$, comparable to the values derived here
(see Table 2).

In the following, we investigate whether the observed properties
of z ~ 2 quiescent galaxies are consistent with having
formed under such conditions. Assuming that z ~ 2 quiescent
galaxies formed in Eddington-limited maximum starbursts, we
can estimate the maximum SFR and the duration of this burst
from the observed size, velocity dispersion, and stellar mass of
the quiescent remnants. In Figure 4 the red curve shows the distribution
of $\mathrm{SFR}_{\max}$ for the sample of z ~ 2 quiescent galaxies
described in Section 2.4, calculated from Equation (3), assuming
$\kappa_{100} = 1$, $D_{\rm kpc} = 2\,r_{e,c}$ (where $r_{e,c}$ are the effective radii measured
for the individual galaxies), and $\sigma_{400} = \langle\sigma\rangle/400$ km s$^{-1}$,
where $\langle\sigma\rangle = 363 \pm 100$ km s$^{-1}$ is the mean velocity dispersion
measured for z ~ 2 quiescent galaxies in the literature (Toft et al.
2012). We use this mean value because measured velocity dispersions
for the K13 sample are not available.
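
The maximum-starburst scaling of Equation (3), i.e., 480 times the square of the velocity dispersion (in units of 400 km s$^{-1}$) times the physical scale in kpc, divided by the opacity parameter, can be evaluated as in the minimal sketch below. It uses the mean velocity dispersion quoted above together with a hypothetical effective radius of 1.5 kpc, a value chosen for illustration rather than taken from the sample.

```python
def sfr_max(sigma_km_s, d_kpc, kappa_100=1.0):
    """Eddington-limited SFR scaling: 480 * sigma_400^2 * D_kpc / kappa_100 (M_sun/yr)."""
    sigma_400 = sigma_km_s / 400.0
    return 480.0 * sigma_400 ** 2 * d_kpc / kappa_100


MEAN_SIGMA_KM_S = 363.0   # mean velocity dispersion quoted in the text
r_e_kpc = 1.5             # HYPOTHETICAL effective radius, for illustration only

# The starburst scale is taken as twice the effective radius, as in the text
sfr = sfr_max(MEAN_SIGMA_KM_S, 2.0 * r_e_kpc)

print(f"SFR_max = {sfr:.0f} M_sun/yr")
```

For these inputs the maximum SFR is roughly 1200 $M_\odot$ yr$^{-1}$, within the 500-3000 $M_\odot$ yr$^{-1}$ range observed for the z > 3 SMGs.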

There is a good general correspondence between the $\mathrm{SFR}_{\max}$
distribution of quiescent z ~ 2 galaxies and the SFR distribution
of z > 3 SMGs. The $\mathrm{SFR}_{\max}$ distribution peaks at higher SFRs
than the observed distribution in z > 3 SMGs, indicating that
some of the z ~ 2 quiescent galaxies may have formed in
starbursts with sub-Eddington SFRs. Also plotted in Figure 4(b)
is the duration of this “maximum starburst”

$$t_{\rm burst} = \Delta M_*/\mathrm{SFR}_{\max}, \qquad (4)$$

assuming a constant star formation rate SFR = $\mathrm{SFR}_{\max}$ during
the burst, and that all the stellar mass of the z ~ 2 quiescent
galaxies was created during this burst, i.e., $\Delta M_* = M_*$. While
consistent within the errors, the mean derived timescale for
Eddington-limited starbursts is about a factor of two longer
than the starburst timescale derived from comparing comoving
number densities. This can be accounted for by changing some
of the assumptions, e.g., if the SMG starbursts are triggered
by major mergers, a fraction of the stellar mass must have
been formed in the progenitor galaxies prior to the merger.
In Figure 4(c) we show that if we assume that only half of the
observed stellar mass in z ~ 2 quiescent galaxies was created
in a z > 3 Eddington-limited starburst, i.e., $\Delta M_* = 0.5\,M_*$,
there is excellent agreement between the derived timescales,
consistent with the idea that z > 3 SMGs are the progenitors of
z ~ 2 quiescent galaxies. Interestingly, this is consistent with the
results of Michalowski et al. (2010b), who found that on average
~45% of the stellar mass in a sample of z > 4 SMGs was formed
in their ongoing starbursts. If half the stellar mass formed prior to
the merger that ignites the SMG starburst, an implication is that
the merger progenitors must have been gas-rich star-forming
galaxies, in agreement with the high gas fractions found in high
redshift star-forming galaxies (Tacconi et al. 2013).

Figure 5. Red histograms show the distribution of stellar masses in
$\log(M_*/M_\odot) > 11$ quiescent galaxies at z ~ 2. In the top panel the blue histogram
shows the distribution of stellar masses in the z > 3 SMGs. In the
bottom panel the blue histogram shows the final stellar masses of the z > 3
SMGs assuming that 10% ± 5% of their derived gas mass is turned into stars
during the remainder of the ongoing starburst (Hayward et al. 2011).

(A color version of this figure is available in the online journal.)
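
The burst duration of Equation (4) is the stellar mass formed in the burst divided by the maximum SFR. The sketch below shows how the two cases discussed above (all, or half, of the stellar mass formed in the burst) differ. The stellar mass and SFR are hypothetical round numbers chosen for illustration, not values from the sample.

```python
def burst_duration_myr(mstar_msun, sfr_max_msun_yr, burst_fraction=1.0):
    """t_burst = Delta M_* / SFR_max in Myr, with Delta M_* = burst_fraction * M_*."""
    delta_mstar = burst_fraction * mstar_msun
    return delta_mstar / sfr_max_msun_yr / 1.0e6


# HYPOTHETICAL inputs: M_* = 1e11 M_sun, SFR_max = 1200 M_sun/yr
MSTAR = 1.0e11
SFR_MAX = 1200.0

t_full = burst_duration_myr(MSTAR, SFR_MAX)        # all stellar mass formed in the burst
t_half = burst_duration_myr(MSTAR, SFR_MAX, 0.5)   # half formed before the merger

print(f"t_burst (all mass)  = {t_full:.0f} Myr")
print(f"t_burst (half mass) = {t_half:.0f} Myr")
```

Since the duration scales linearly with the mass fraction formed in the burst, halving that fraction halves the timescale, which is the size of the adjustment needed to reconcile the two estimates.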

3.7. Additional Stellar Mass Growth and Quenching
of the z > 3 SMGs 

The similar mass-size distribution of the z > 3 SMGs and 
z ~ 2 quiescent galaxies is in agreement with what one would
expect if the z > 3 SMGs evolve passively into z ~ 2 quiescent 
galaxies, after they have been quenched. Prior to the quenching, 
however, the ongoing starburst will increase the stellar masses 
of the galaxies. In Figure 5 we show that the distribution 
of stellar masses in the z > 3 SMGs is broader than that of
$\log(M/M_\odot) > 11$ quiescent galaxies at z ~ 2. We can estimate
the growth of stellar mass in the individual z > 3 SMGs 
from their gas masses, inferred from the FIR SED fits (see 
Table 2). From these we can estimate the final stellar masses of
the z > 3 SMGs if we assume a star formation efficiency, i.e.,
the fraction of gas that is turned into stars during the starburst. 
In the simulations of Hayward et al. (2011) the gas fraction 
decreases from 45% to 40% in isolated disks and from 17.5% 
to 15% in merging galaxies, from the peak of the starburst to 
when it ends, which corresponds to a decrease in gas mass of 5% 
and 15% during this time. If we assume that this gas is turned 
into stars, and that we are observing the SMGs at the peak of 
their starburst, the models thus indicate that ~10% ± 5% of 
the observed gas mass in the z > 3 SMGs will be turned into 
stars during the remainder of the burst. In Figure 5 we compare 
the final stellar mass distribution of the z > 3 SMGs with that 
of z ~ 2 quiescent galaxies, assuming that 10% of
the derived gas mass in the z > 3 SMGs is turned into stars 
before the starbursts are quenched. The two distributions are 
similar, with a K-S test statistic of 0.33 and a probability of 
67%, in agreement with a direct evolutionary link between the 
two populations. The mass increase from the time the SMGs 
are observed up to the end of the starburst will likely not 
significantly increase the effective radii as the process is highly 
dissipative, resulting in a slight horizontal shift in the $M_*$-$r_e$
plane (blue points in Figure 3). The continued starbursts and 
subsequent quenching may also provide the mechanism needed 
to transform the observed low-n disk-like surface brightness 
profiles observed in SMGs to the higher n bulge-like profiles 
observed in quiescent galaxies at z ~ 2 (Wuyts et al. 2011; 
Szomoru et al. 2011; Bell et al. 2012). Most of the stellar mass 
will be added in the nuclear regions of the SMGs, which are
likely highly obscured by dust. Once the quenching sets in and
most of the dust is destroyed or blown away, a more centrally 
concentrated surface brightness distribution could be revealed. 
Note that if, as assumed here, only 10% of the large derived 
gas mass in the z > 3 SMGs is turned into stars during the 
remainder of the burst, the following quenching mechanism 
must be highly efficient at heating or expelling the substantial 
amounts of leftover gas. A possible mechanism for expelling the 
gas is through outflows, driven by strong winds associated with 
the maximum starbursts. Tentative evidence for such outflows 
has recently been observed in the 163 μm OH line profile in an
SMG at z = 6.3 (Riechers et al. 2013). We stress that the large 
systematic uncertainties on the derived stellar masses for the 
SMGs could potentially influence our conclusions. 
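
The mass growth assumed in this section is a simple addition of a fixed fraction of the gas mass to the observed stellar mass. The sketch below illustrates the size of the resulting shift for the adopted efficiency of 10% ± 5%. The stellar and gas masses are hypothetical values, not entries from Table 2.

```python
import math


def final_stellar_mass(mstar_msun, mgas_msun, efficiency=0.10):
    """Stellar mass after a fraction `efficiency` of the gas is turned into stars."""
    return mstar_msun + efficiency * mgas_msun


# HYPOTHETICAL galaxy: M_* = 1e11 M_sun, M_gas = 2e11 M_sun
MSTAR = 1.0e11
MGAS = 2.0e11

m_final = final_stellar_mass(MSTAR, MGAS)            # adopted 10% efficiency
m_final_lo = final_stellar_mass(MSTAR, MGAS, 0.05)   # lower bound of 10% +/- 5%
m_final_hi = final_stellar_mass(MSTAR, MGAS, 0.15)   # upper bound of 10% +/- 5%

shift_dex = math.log10(m_final / MSTAR)
print(f"final mass = {m_final:.2e} M_sun (shift of {shift_dex:.2f} dex)")
print(f"range: {m_final_lo:.2e} to {m_final_hi:.2e} M_sun")
```

Even for a gas mass twice the stellar mass, the shift is below 0.1 dex, which corresponds to the slight horizontal displacement in the mass-size plane described above.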

4. SUMMARY AND DISCUSSION 

4.1. The Link between z > 3 SMGs and z ~ 2 
Compact, Quiescent Galaxies 

In this article we presented evidence for a direct evolutionary
connection between two of the most extreme galaxy types in
the universe, the highest redshift (z > 3) SMGs that host some
of the most intense starbursts known and quiescent galaxies
at z ~ 2 that host the densest conglomerations of stellar mass
known. The comparison was motivated by the recent discovery
of a significant population of SMGs at 3 < z < 6 and
high-resolution imaging and spectroscopic studies of z ~ 2 quiescent
galaxies that show that the majority of their stars likely
formed in massive nuclear, possibly dust-enshrouded, starbursts
in this redshift range. From a unique flux-limited statistical sample
of z > 3 SMGs in the COSMOS field, we have put robust
constraints on their comoving number density, which we then
put in the context of the comoving number densities of quiescent
galaxies of similar mass at z ~ 2. If z > 3 SMGs are progenitors
of z ~ 2 quiescent galaxies, then our data imply that the
SMG duty cycle must be $t_{\rm burst} = 42^{+40}_{-29}$ Myr, where the error bars
include our best estimates of the effects of cosmic variance, photometric
redshift errors, and incompleteness. This timescale is
independent from, but in good agreement with, estimates based
on SMG gas depletion timescales ($t_{\rm burst}$ ~ 40-200 Myr), estimates
from hydrodynamical merger simulations ($t_{\rm burst}$ ~ 50 Myr; e.g.,
Mihos & Hernquist 1996; Cox et al. 2008), and estimates based
on the time compact starburst galaxies spend above the main
sequence of star formation ($t_{\rm burst}$ < 70 Myr; Wuyts et al. 2011).
Importantly, as our estimate of the SMG starburst timescale
is based only on number density arguments, it is relatively
independent of assumptions of the underlying stellar IMF, which
is a large potential systematic uncertainty, e.g., in depletion
timescale estimates.

Figure 6. Schematic illustration of the formation and evolutionary sequence for massive galaxies advocated in this article.
(A color version of this figure is available in the online journal.)
Based on stellar masses derived from UV-MIR photometry 
and sizes derived from deep NIR imaging, we have shown that 
the mass-size distribution of the z > 3 galaxies is remarkably 
similar to that observed for compact quiescent massive galaxies 
at z ~ 2, and that the two populations have similar mean internal stellar mass surface
densities, $\langle\log\Sigma\rangle \sim 9.8\,M_\odot$ kpc$^{-2}$. The surface brightness distributions
of the z > 3 SMGs are best fit by Sersic models with
low Sersic n parameters, which is typical of local star-forming 
disk galaxies or mergers. The majority also show multiple com- 
ponents or irregularities indicative of ongoing merging and/or 
clumpy structures. 

Many similarities between z ~ 2 quiescent galaxies and 
SMGs exist: they have similar stellar masses, characteristic 
internal velocities, dynamical masses, sizes, correlation lengths, 
etc. Millimeter measurements of z > 3 SMGs in continuum and 
CO show signatures of merging or rotation (Younger et al. 2008, 
2010; Riechers et al. 2011b, 2011c), with molecular emission 
line widths in the range of 300-700 km s$^{-1}$ (with a few outliers)
and a mean $\langle\mathrm{FWHM}\rangle = 456 \pm 253$ km s$^{-1}$ (Schinnerer et al.
2008; Daddi et al. 2009a; Coppin et al. 2010; Riechers et al.
2010, 2011b, 2011c; Swinbank et al. 2012; Walter et al. 2012),
similar to the stellar velocity dispersions $\sigma = 300$-$500$ km s$^{-1}$
measured in z ~ 2 quiescent galaxies. For example, for AzTEC3,
at z = 5.3, Riechers et al. (2010) measured a CO line width
of 487 km s$^{-1}$ and a gas depletion timescale of 30 Myr, similar
to the SMG starburst timescale derived here. At the depth and
resolution of the present data, it is impossible to make strong
claims about how many z > 3 SMGs are in the process of
merging. However, all the detected galaxies show evidence of
close companions, multiple components, or clumpy structure
and have low derived Sersic indices, which is consistent with
expectations for merging galaxies. In particular, the two galaxies
with HST/WFC3 data appear to be major mergers.

The evidence presented in this article is in support of a di- 
rect evolutionary connection between z > 3 SMGs, through 
compact quiescent galaxies at z ~ 2 to giant elliptical galax- 
ies in the local universe. In this scenario (illustrated in 
Figure 6) gas-rich, major mergers in the early universe trigger
nuclear dust-enshrouded starbursts,$^{24}$ which on average last
$42^{+40}_{-29}$ Myr, followed by star formation quenching, either due
to gas exhaustion, feedback from the starburst, or the ignition 
of an AGN, leaving behind compact stellar remnants to evolve 
passively for about a Gyr into the compact quiescent galaxies 
we observe at z ~ 2. Over the next 10 Gyr, these then grow 
gradually, primarily through minor merging, into local elliptical 
galaxies. 

4.2. Connection to Compact Star-forming 
Galaxies at 2.5 < z < 3 

Barro et al. (2013) found a population of relatively massive
($\log(M/M_\odot) > 10$) compact star-forming galaxies (cSFGs) at
1.4 < z < 3, which show evidence of quenching beginning 
to set in (i.e., lower specific SFRs than typical star-forming 
galaxies and increased AGN fractions). Their masses, sizes, 
and number densities (which increase with decreasing redshift 
at the same time the number density of quiescent galaxies 
increases) suggest that the highest redshift examples of these 
may be progenitors of compact quiescent z ~ 2 galaxies. These 
galaxies are thus good candidates for transition objects in the 
evolutionary sequence suggested here between the z > 3 SMGs 
and the z ~ 2 quiescent galaxies. The comoving number density 
of the most massive cSFGs ($\log(M/M_\odot) > 10.8$) at 2.5 < z < 3
is ~$(5.4 \pm 2.5) \times 10^{-5}$ Mpc$^{-3}$, which is comparable to the
number density for z ~ 2 quiescent galaxies. However, the
cSFGs are not massive enough to be descendants of the brightest
z > 3 SMGs or progenitors of most of the massive z ~ 2
quiescent galaxies considered here, as none of the cSFGs have
$\log(M/M_\odot) > 11$ (M. Barro 2013, private communication); rather, they
are likely descendants of less intense starbursts at z > 3 and
progenitors of slightly lower mass quiescent z = 2 galaxies.


$^{24}$ The SMG image in the figure is adopted from Targett et al. (2013).

4.3. Caveats and Outlook

One of the largest uncertainties in the derived quantities for the z > 3
SMG sample is associated with their stellar masses. As exten- 
sively discussed in Michalowski et al. (2012a), stellar masses 
for SMGs are highly dependent on the assumed star formation 
history and may differ by up to ±0.5 dex given different as- 
sumptions and models. Dynamical mass considerations may set 
an upper limit to stellar masses. However, the z > 3 SMG sam- 
ples with available dynamical mass estimates are still sparse, as 
well as subject to their own biases. 

The sample of z > 3 SMGs is still small and only partially 
spectroscopically confirmed. Future, larger and deeper mm 
surveys, over multiple fields, will allow for better constraints 
on the evolution of the comoving number density of starburst 
galaxies, to the highest redshifts, and to study the effects of 
cosmic variance. This will allow for more detailed tests and 
modeling of the proposed scenario in different redshifts and 
mass bins, rather than in the single mass bin and two redshift 
ranges that we are limited to with the present data. For example, 
the proposed scenario implies that the significant population of 
z ~ 2 SMGs should evolve into compact, ~1 Gyr old, massive
poststarburst galaxies at z ~ 1.5. Interestingly, Bezanson et al. 
(2012) recently published a spectroscopic sample of galaxies 
with exactly these properties. Similarly, if compact quiescent 
galaxies at z > 3 are found in the future, the properties of these 
should match those of the highest redshift z > 5 SMGs. With 
deeper data it will also be possible to push to lower SFRs and 
not only consider the most extreme starbursts. This will likely 
provide a way of fitting the 2.5 < z < 3 cSFGs discussed in 
Section 4.2 into the evolutionary picture. 

Cosmological surface brightness dimming and the large 
amounts (and unknown distribution) of dust in SMGs make 
them extremely faint in the rest-frame UV and optical and likely 
bias the sizes measured, even in very deep NIR imaging data. 
However, we do note that one of the galaxies in our sample 
(AzTEC 1) has been resolved in high-resolution submillimeter 
imaging (Younger et al. 2008), with a derived extent of 0."1-0."2,
corresponding to a physical size of 1.3-2.7 kpc, which
is consistent with the constraints on the effective radius we
measure from the UltraVISTA data ($r_e < 2.6$ kpc; see Table 1).
ALMA will greatly improve estimates of the sizes of high- 
redshift SMGs through high-resolution observations of the rest- 
frame FIR dust continuum. We have argued in this article 
that the observed structural properties are consistent with 
the SMGs being disks or mergers, but the constraints are 
uncertain due to the relatively low S/N and spatial resolution 
of the images, e.g., the Sersic n parameters and effective 
radii could be underestimated due to obscuration by dust and 
cosmological surface brightness dimming. With ALMA it will 
be straightforward to determine redshifts from molecular lines 
and constrain the internal dynamics of the galaxies, e.g., estimate 
velocity dispersions and rotational velocities and search for 
evidence of merging. This will provide powerful diagnostics 
to help map the transformation of the most massive galaxies in 
the universe from enigmatic starbursts at cosmic dawn to dead 
remnants a few gigayears later.